Target Cost of F0 Based on Pol Concatenative Speec
نویسندگان
چکیده
This paper proposes a target cost function for F0 based on polynomial regression for use in concatenative speech synthesis. Polynomial regression is used to express the time series of F0 continuously, and remove effects of microprosody. We conducted a perceptual experiment and confirmed that the proposed function provides a higher correlation with perceptual scores than does the conventionally used cost function.
منابع مشابه
Objective Distance Measures for S Concatenative Speec
In unit selection based concatenative speech systems, join cost, which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from the inventory. The ideal join cost will measure perceived discontinuity, based on easily measurable spectral properties of the units being joined, in order to ensure smooth and natural-sounding synthetic speec...
متن کاملQuasi-syllabic and quasi-articula concatenative speec
In this paper we propose methods of speech segmentation and unit characterization which are motivated by prosodic and physiological principles. In particular, we motivate and describe algorithms for unit-database creation on the basis of quasi-syllables and quasi-articulatory-gestures defined and parameterized purely by acoustic measurements. This approach is intended to overcome the burden of ...
متن کاملComparing spectral distance measures for join cost optimization in concatenative speech synthesis
In concatenative synthesis the join cost function can be related to the probability of a perceived discontinuity at the join. Therefore it is important that the distance measures in the cost function correlate highly with human perceived discontinuities. In this paper the results of a listening test on joins in two Norwegian long vowels: /A:/ and /e:/, is presented. Five spectral distance measu...
متن کاملEmotion conversion using F0 segment selection
This paper describes F0 segment selection, a novel syllablebased F0 conversion method, which provides a concatenative framework to search for F0 segments in a modest corpus of emotional speech (∼15 minutes of data). The method is compared with our earlier work on F0 generation using contextsensitive syllable HMMs. Both methods are complemented with a duration conversion module as well as GMM-ba...
متن کاملUse of pitch pattern improvement in the CHATR speech synthesis system
A corpus-based concatenative speech synthesis system using no signal processing can produce intelligible synthetic speech maintaining original voice characteristics, but it can sometimes be di cult to realize natural prosody. In such a concatenative system, it is very important to select appropriate waveform segments that are naturally close to the target prosody. This paper describes some appr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003